Phonetic events from the labeling the European Portuguese database for speech synthesis, FEUP/IPBDB
Abstract
This paper presents a new labeled speech signal database (FEUP/IPBDB) in Standard European Portuguese (hereafter SEP). The objective of this work is twofold: on the one hand, to provide phonetic material for building Text-to-Speech (TTS) systems, whether from scratch or to improve the quality of existing ones; on the other hand, to place at the service of the SEP scientific community a phonetically and prosodically valuable speech corpus, essential for Speech Synthesis or Phonetics research. We intend to make it available to the scientific community, since no other database of its kind exists for EP. The main features of the database are described, along with some basic statistical aspects. Some methodological problems and some phenomena in experimental phonetics observed during the speech signal labeling are also discussed. Our approach is to produce a resource that can be further improved in subsequent steps with minimal rework. Phonetic, linguistic and technical consistency is guaranteed through the involvement of a multidisciplinary team.
Similar resources
HESITA(te) in Portuguese
Hesitations, so-called disfluencies, are a characteristic of spontaneous speech, playing a primary role in its structure and reflecting aspects of language production and the management of inter-communication. In this paper we present HESITA, a database of hesitations in European Portuguese speech, as a relevant basis for studying a variety of speech phenomena. Patterns of hesitatio...
Phonetically Transcribed Speech Corpus Designed for Context Based European Portuguese TTS
This paper presents a speech corpus for European Portuguese (EP), designed for context based text-to-speech (TTS) synthesis systems. The speech corpus is intended for small footprint engines and is composed of one sentence dedicated to each sequence of two phonemes of the language, incorporating as many language contexts as possible at diphone and word levels. The speech corpus is presented in ...
On the Identification of Word-Boundaries using Phonological Rules for Speech Recognition and Labeling
In this paper we studied the phonemic structure of word beginnings and endings in standard European Portuguese (hereafter EP). The generativist description of Portuguese phonology [1] was used as the framework, and the phonetic and acoustic experiments performed by Delgado-Martins [2] served as a model for the phonetic background in EP. We also compared the results between the expecte...
Automatic Phonetic Segmentation and Labelling of Spontaneous Speech
In this paper a tool for automatic segmentation and labeling of spontaneous speech is presented. It is developed and specially tuned for the European Portuguese (EP) language, but only simple changes are needed to adapt it to other languages. The main purpose of this system is to quickly produce a high quality output of phonetic labels and related time boundaries using as input the speech signal on...
Unit Selection Speech Synthesis Using Phonetic-Prosodic Description of Speech Databases
This paper describes an approach to speech synthesis based on using speech databases at different stages of the TTS process. Speech database units are phones in different segmental and prosodic contexts. Pitch synchronous segmentation and labeling of databases allow storing both segmental and prosodic information. Phonetic-prosodic annotations of speech databases are involved in off-line training ...
Publication date: 2001